Spatial prisoner's dilemma game with volunteering in Newman-Watts small-world networks
A modified spatial prisoner's dilemma game with voluntary participation in Newman-Watts
small-world networks is studied. Some reasonable ingredients are introduced into the
evolutionary dynamics of the game: each agent in the network is a pure strategist and can
take only one of three strategies (cooperator, defector, or loner); its strategy change
depends on both the number of each strategy state adopted and the magnitude of the average
profits acquired by its coplayers in the previous round of play; and a stochastic strategy
mutation is applied when the agent gets into the trouble of local commons, in which the
agent and its neighbors are in the same state and get the same average payoffs. In the case
of very low temptation to defect, it is found that agents are willing to participate in the
game in the typical small-world region, and intensive collective oscillations arise in the
more random region.
PACS numbers: 02.50.Le, 87.23.Kg, 87.23.Ge, 89.75.Hc

There has been a long history of qualitatively studying the complex behaviors of
biological, ecological, social, and economic systems using special game models. After the
prisoner's dilemma game (PDG) was first applied by von Neumann and Morgenstern [1] to study
economic behavior, great developments have been made in many subsequent studies. Recently,
more and more attention has been focused on applications of the PDG in the fields of
biology [2], economy, ecology, and other domains. Game theory and evolutionary theory
provide a powerful metaphor for simulating the interactions of individuals in these systems.

Most realistic systems can be regarded as composed of a large number of individuals with
simple local interactions. For example, human beings are limited in territory and interact
more frequently with their neighbors than with those far away. Therefore, the spatial
structure may greatly affect their activities. Since Axelrod [7] suggested the idea of
playing the PDG on a lattice, spatial prisoner's dilemma games (SPDG) have been extensively
explored in various kinds of network models in the past few years, including regular
lattices, random regular graphs [11], random networks with fixed mean degree distribution,
small-world networks [13, 14, 15], and real-world acquaintance networks [16]. In the general
SPDG, each agent can take one of two strategies (or states): cooperator (C) and defector
(D). There are four possible combinations: (C, C), (C, D), (D, C), and (D, D), which get
payoffs (r, r), (s, t), (t, s), and (p, p), respectively. The parameters satisfy the
conditions t > r > p > s and 2r > t + s, which lead to a so-called dilemma situation where
mutual trust and cooperation are beneficial in the long run but egoism and guile can produce
a big short-term profit. Agents update their states by imitating the strategy of the
wealthiest among their neighbors in subsequent plays. For large values of t the system
easily gets into an absorbing state in which all agents are D, which is known as the
tragedy of the commons [17].

Recently, Szabo et al. developed the SPDG with voluntary participation, in which agents can
take one of three possible strategies: cooperator, defector, and loner (L). Cooperators and
defectors are interested in taking part in the game, and the payoffs for their encounters
are assigned as before. Loners temporarily do not participate in the game; a loner and its
coplayer each get the same small fixed income σ (σ < r < t). Thus the payoff matrix can be
tabulated as

\[
\begin{array}{c|ccc}
  & C & D & L \\ \hline
C & r & s & \sigma \\
D & t & p & \sigma \\
L & \sigma & \sigma & \sigma
\end{array}
\]

Each element in the matrix denotes the payoff of an agent adopting the strategy on the left
when encountering an agent playing the strategy at the top. In the volunteering version, the
three strategies can coexist through cyclic dominance (D invades C invades L invades D),
which efficiently prevents the system from getting into a frozen state.
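
The payoff rules above can be written as a small lookup table. The following is a minimal sketch, not the authors' code: the function name `payoff_matrix` and the string labels are hypothetical, and the numerical values in the usage lines are illustrative only.

```python
from typing import Dict, Tuple

# Strategy labels (hypothetical names): cooperator, defector, loner.
C, D, L = "C", "D", "L"


def payoff_matrix(r: float, s: float, t: float, p: float,
                  sigma: float) -> Dict[Tuple[str, str], float]:
    """Map (own strategy, coplayer strategy) to the payoff of the own strategy.

    Any encounter that involves a loner pays the fixed income sigma to both sides.
    """
    return {
        (C, C): r, (C, D): s, (C, L): sigma,
        (D, C): t, (D, D): p, (D, L): sigma,
        (L, C): sigma, (L, D): sigma, (L, L): sigma,
    }


# Illustrative values only: r = 1, s = p = 0, t = 1.5, sigma = 0.3.
table = payoff_matrix(r=1.0, s=0.0, t=1.5, p=0.0, sigma=0.3)
defector_vs_cooperator = table[(D, C)]   # the temptation t
loner_vs_defector = table[(L, D)]        # the fixed income sigma
```

Keeping the table keyed by ordered pairs makes the asymmetry between (C, D) and (D, C) explicit.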

In this Brief Report, we study the SPDG with voluntary participation in the Newman-Watts
(NW) network, which is a typical small-world model constructed as follows. We start with a
two-dimensional lattice with periodic boundary conditions; each agent is located on a
lattice site and linked to its four nearest neighbors. For every agent and for each of its
four links, with probability Q we add a long-range link to an agent selected at random from
the whole system, with duplicate links forbidden. An NW network is thus realized (see Ref.
[18] for details). The structural characteristics of social communities, namely high
clustering and small diameter, are well described by this small-world graph. A round of
play consists of the encounters of all agents with their nearest neighbors. Following Ref.
[14], the payoffs earned by the agents are calculated as averages and are not accumulated
from round to round. To start the next round, agents are allowed to inspect the profits
collected by their neighbors and adjust their strategies.
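
The construction and the averaged payoffs can be sketched as follows. This is one possible reading of the description, stated as an assumption: each agent makes four independent attempts (one per lattice link) to add a shortcut with probability Q, attempts that would create a self-loop or a duplicate link are simply dropped, and "neighbors" in a round means all linked agents, shortcuts included. The function names are hypothetical.

```python
import random
from typing import Dict, Set, Tuple


def newman_watts_lattice(n: int, q: float, rng: random.Random) -> Dict[int, Set[int]]:
    """Adjacency sets of an n x n periodic lattice (n >= 3) plus random shortcuts."""
    def idx(x: int, y: int) -> int:
        return (x % n) * n + (y % n)

    adj: Dict[int, Set[int]] = {i: set() for i in range(n * n)}
    # Regular part: link every site to its right and lower neighbours (periodic).
    for x in range(n):
        for y in range(n):
            i = idx(x, y)
            for j in (idx(x + 1, y), idx(x, y + 1)):
                adj[i].add(j)
                adj[j].add(i)
    # Shortcuts: four attempts per agent, each accepted with probability q.
    for i in range(n * n):
        for _ in range(4):
            if rng.random() < q:
                j = rng.randrange(n * n)
                if j != i and j not in adj[i]:   # no self-loops, no duplicates
                    adj[i].add(j)
                    adj[j].add(i)
    return adj


def average_payoffs(adj: Dict[int, Set[int]], strategy: Dict[int, str],
                    payoff: Dict[Tuple[str, str], float]) -> Dict[int, float]:
    """Payoff of each agent averaged over its linked neighbours in one round."""
    return {
        i: sum(payoff[(strategy[i], strategy[j])] for j in nbrs) / len(nbrs)
        for i, nbrs in adj.items()
    }
```

Averaging instead of summing keeps agents that received shortcuts from gaining an advantage purely through their higher degree.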

We argue that the ingredients for agents changing their states mainly come from two
aspects. (i) For the sake of pursuing higher profits, agents tend to follow the successful
agents who get higher payoffs, i.e., "successful" strategies are imitated. We assume that
the ith agent adopts the strategy of an arbitrary neighbor j with a probability
FIG. 1: The evolution of the density of defectors (ρ_D) for various values of (RTQ, Q) in
the equilibrium state: (a) from top to bottom, the curves correspond to (0.02, 0.1),
(0.56, 0.1), and (0.56, 0.5), respectively; and (b) (0.02, 0.5).
FIG. 2: MC data of the density of defectors as a function of the network's structure
parameter Q for different values of RTQ: 0.02 (a), 0.1 (b), 0.2 (c), and 0.8 (d). Closed
squares represent the average density of defectors; open circles and triangles show its
maximal and minimal values due to oscillation.



where g_j denotes the average profit earned by the jth agent and Ω_i is the community
composed of the nearest neighbors of i and i itself. (ii) When an agent and its neighbors
are in the same state and get the same average payoffs, it has a spontaneous willingness to
mutate. We propose that agents in this situation make spontaneous alterations with a
probability depending on the elements of the payoff matrix. If the agent under
consideration is C, the probabilities of its changing to C, D, or L in the next round are
r/(r + t + σ), t/(r + t + σ), and σ/(r + t + σ), respectively; if the agent is D, the
probabilities of its changing to C, D, or L are s/(s + p + σ), p/(s + p + σ), and
σ/(s + p + σ), respectively; and if the agent is L, the probabilities of its changing to C,
D, or L are all equal to 1/3. This spontaneous mutation mechanism not only efficiently
prevents the system from getting into a frozen state but also describes the agents'
flexibility.
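
The two ingredients can be combined into a single update step, sketched below under explicit assumptions. The explicit expression for the imitation probability is assumed here to be g_j divided by the sum of g_k over the community of i (its neighbors plus i itself), which is one form consistent with the definitions in the text; payoffs are assumed non-negative; the loner's fixed income is called `sigma`; and all function names are hypothetical.

```python
import math
import random
from typing import Dict, List


def mutation_probs(state: str, r: float, s: float, t: float, p: float,
                   sigma: float) -> Dict[str, float]:
    """Probabilities of switching to C, D, L for an agent caught in local commons."""
    if state == "C":
        weights = (r, t, sigma)
    elif state == "D":
        weights = (s, p, sigma)
    else:                       # a loner picks each strategy with probability 1/3
        weights = (1.0, 1.0, 1.0)
    total = sum(weights)
    return {k: w / total for k, w in zip("CDL", weights)}


def next_strategy(i: int, neighbours: List[int], strategy: Dict[int, str],
                  g: Dict[int, float], params: Dict[str, float],
                  rng: random.Random) -> str:
    """Strategy of agent i in the next round (to be applied to all agents in parallel)."""
    same_state = all(strategy[j] == strategy[i] for j in neighbours)
    same_payoff = all(math.isclose(g[j], g[i]) for j in neighbours)
    if same_state and same_payoff:
        # Rule (ii): local commons, so mutate according to the payoff-matrix weights.
        probs = mutation_probs(strategy[i], **params)
        return rng.choices(list(probs), weights=list(probs.values()))[0]
    # Rule (i), assumed form: pick a member of the community with weight g_j.
    group = [i, *neighbours]
    total = sum(g[j] for j in group)
    if total <= 0.0:
        return strategy[i]      # nothing to imitate; keep the current strategy
    chosen = rng.choices(group, weights=[g[j] for j in group])[0]
    return strategy[chosen]
```

Because the community includes the agent itself, an agent whose own payoff is large relative to its neighbors' tends to keep its strategy under this assumed form.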

Our analysis of the model is based on systematic Monte Carlo (MC) simulations performed in
different NW networks with a total size of 200 × 200 agents. The three strategies are
assigned randomly to the agents with probability 1/3
FIG. 3: MC data of the density of defectors as a function of RTQ for different values of
the network's structure parameter Q: 0.001 (a), 0.1 (b), 0.5 (c), and 1.0 (d). The symbols
are the same as in Fig. 2.
FIG. 4: The density of cooperators (a), defectors (b), and loners (c) vs the network's
structure parameter Q for different values of RTQ. Open squares, closed circles, open
triangles, closed diamonds, and open stars correspond to RTQ = 0.02, 0.1, 0.2, 0.56, and
0.8, respectively.



initially. For convenience, following previous studies, we set s = p = 0, r = 1, σ = 0.3,
and 1 < t < 2. We define t - r as the relative temptation quantity (RTQ for short), which
reflects the extent of the temptation, and roughly partition the networks into three
regions, lattice, small-world, and random graphs, corresponding to the ranges of Q
(0.0001, 0.001), (0.001, 0.3), and (0.3, 1), respectively. We iterate the rules of the
model with parallel updating. The total sampling time is 5000 MC steps. After an
appropriate relaxation time the system stabilizes in a dynamical equilibrium characterized
by the densities ρ_C, ρ_D, ρ_L and the average payoffs P_C, P_D, P_L. According to the
previous assumption, P_L always equals σ. All the results are averaged over ten network
realizations.
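
As a worked illustration, the mutation probabilities for an agent in local commons can be evaluated with these parameter values (s = p = 0, r = 1, loner income 0.3, t = 1 + RTQ). The symbol W for the probabilities is an assumed notation, and RTQ = 0.02 is picked only as an example.

```latex
\begin{align*}
W(C \to C) &= \frac{r}{r+t+\sigma} = \frac{1}{2.3+\mathrm{RTQ}}, &
W(C \to D) &= \frac{t}{r+t+\sigma} = \frac{1+\mathrm{RTQ}}{2.3+\mathrm{RTQ}}, &
W(C \to L) &= \frac{\sigma}{r+t+\sigma} = \frac{0.3}{2.3+\mathrm{RTQ}}, \\
W(D \to C) &= \frac{s}{s+p+\sigma} = 0, &
W(D \to D) &= \frac{p}{s+p+\sigma} = 0, &
W(D \to L) &= \frac{\sigma}{s+p+\sigma} = 1.
\end{align*}
% Example, RTQ = 0.02 (denominator 2.32):
\[
W(C \to C) \approx 0.431, \qquad W(C \to D) \approx 0.440, \qquad W(C \to L) \approx 0.129 .
\]
```

With s = p = 0, a defector whose neighbors are all defectors therefore always becomes a loner in the next round, whereas a trapped cooperator stays C or turns D with nearly equal probability when RTQ is small.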

The main features of the steady-state phase diagram can be 
summarized as follows. All three states coexist and coevolve 
FIG. 5: Average payoffs of cooperators (a) and defectors (b) vs the network's structure
parameter Q for different values of RTQ: 0.02, 0.1, 0.2, 0.56, and 0.8. The symbols are the
same as in Fig. 4, and the dotted line indicates the fixed average payoff of loners.



steadily in the equilibrium state. For large values of Q with very small values of RTQ,
strong global oscillations arise, similar to the phenomena studied in Ref. [13] for high
temptation to defect. The bifurcation of ρ_D for large values of temptation reported in
earlier studies, however, does not arise in our model. For small values of Q with arbitrary
values of RTQ, or large values of RTQ with arbitrary values of Q, the stationary state is
characterized by a weak global oscillation whose amplitude is significantly smaller than
the corresponding average value. For a clearer view, Fig. 1 tracks the evolution of ρ_D
over the last 2000 steps for Q = 0.1 and 0.5 and RTQ = 0.02 and 0.56 (the evolution of ρ_C
and ρ_L is similar to that of ρ_D). The average values of ρ_D and the corresponding maximum
and minimum in the steady state are also reported in Fig. 2 for fixed values of RTQ (0.02,
0.1, 0.2, 0.8) with Q ∈ (0.0001, 1.0), and in Fig. 3 for fixed values of Q (0.001, 0.1,
0.5, 1.0) with RTQ ∈ (0.0, 1.0). These phenomena can be explained as follows.

During the evolution, defectors cannot form stable large clusters, because the inner agents
of such clusters would get zero profit and possess the same state as their neighbors (local
commons). According to the evolutionary rules, they will try to escape this predicament by
changing their strategies. Namely, the easy formation of clusters of D makes the agents
self-adapt frequently in their communities, which confines the fluctuation of ρ_D to a
narrow range [see Fig. 1(a), Fig. 2(b), Fig. 2(c), Fig. 2(d), Fig. 3(a), and Fig. 3(b)].
Two factors favor the formation of clusters of defectors: a high temptation to defect
(large values of RTQ) and a well-clustered structure of the agents (small values of Q),
both of which greatly strengthen the adoption and imitation of strategy D. Therefore, in
our model, a high temptation to defect only gives rise to a steady oscillation of the
system rather than the bifurcation phenomena studied in Refs. [11, 13]. For poorly
clustered agents (large values of Q) with low temptation, in contrast, the formation of
large clusters of defectors is rather difficult, which slows down the evolution of the
whole system and allows the growth (or decline) of ρ_D to last for a long time, and
consequently broadens the amplitude of the fluctuation [see Fig. 1(b), Fig. 2(a),
Fig. 3(c), and Fig. 3(d)].
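
The average, maximal, and minimal densities used to distinguish weak from strong oscillations can be extracted from a simulated time series of the defector density as sketched below. The window of 2000 steps follows the tracking window mentioned in the text; the function names and the threshold `ratio` are hypothetical choices for illustration, since the text only states that a weak oscillation has an amplitude significantly smaller than the average.

```python
from typing import Sequence, Tuple


def oscillation_summary(rho: Sequence[float], tail: int = 2000) -> Tuple[float, float, float]:
    """Return (mean, maximum, minimum) of the last `tail` samples of a density series."""
    window = list(rho[-tail:])
    if not window:
        raise ValueError("empty time series")
    mean = sum(window) / len(window)
    return mean, max(window), min(window)


def is_weak_oscillation(rho: Sequence[float], tail: int = 2000,
                        ratio: float = 0.5) -> bool:
    """True if half the peak-to-peak range is below `ratio` times the mean (assumed criterion)."""
    mean, hi, lo = oscillation_summary(rho, tail)
    return (hi - lo) / 2.0 < ratio * mean
```

Applied to each (RTQ, Q) pair, the three numbers returned by `oscillation_summary` correspond to the kind of average, maximal, and minimal values plotted against Q or RTQ.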

In addition, in the lattice region, ρ_D stays at a steady level for all values of RTQ [see
Fig. 2, Fig. 3(a), and Fig. 4(b)]. This is also a result of the fast self-adaptation of the
agents. As RTQ increases, C agents easily change to D because of the high temptation, and
then change again to L because clusters of defectors are extremely unstable and cannot
survive for long. The decrease of ρ_C therefore results almost entirely in an increase of
ρ_L [see Fig. 4(a) and Fig. 4(c)]. In this region, the fast self-adaptation of the agents
also means that the neighbors of defectors include other types of agents most of the time
during the evolution, which gives rise to larger values of P_D than P_L. By comparison, in
Refs. [11, 13] very big clusters of defectors can survive for a long time during the
evolution and most of their agents get only the zero payoff, resulting in lower average
payoffs for the defectors than for the loners. It is obvious that the differences in the
evolutionary dynamics of the game give rise to the distinct results. It is worth mentioning
that the present model also differs from the cyclic spatial games studied in Ref. [19],
where the dynamics is governed by a strictly cyclic dominance, i.e., rock dominates
scissors dominates paper dominates rock. In our model, by contrast, any two of the three
strategies can transform into each other in particular cases. As a result of this
difference in evolutionary dynamics, the phase-transition phenomena studied in Ref. [19]
for rock-scissors-paper games do not arise in our model.

Another interesting feature of the equilibrium phase diagram is that in the vicinity of
Q = 0.1, where the NW networks possess a notable small-world effect, namely large
clustering and small diameter at the same time, agents are willing to participate in the
game in the case of very low temptation to defect. To show this in detail, in Fig. 4 and
Fig. 5 we plot the average densities and the corresponding average payoffs of C, D, and L
vs the small-world parameter Q for different values of RTQ, respectively. For very low
temptation to defect (e.g., RTQ = 0.02), ρ_L decreases slowly with increasing Q and reaches
a minimum at a certain turning point. As Q increases beyond this point, ρ_L rises rapidly
(see Fig. 4). We conclude that two factors, the very low temptation to defect and the
small-world property of the network, are beneficial for the spreading of C in the system,
which then stimulates more and more agents to take part in the game. In the case of more
random networks (Q → 1), the results of the game are qualitatively the same as those of
earlier studies, i.e., the majority of members in the system are loners and the values of
P_C and P_D get close to the fixed value σ.

In summary, we have studied the SPDG with voluntary participation in NW small-world
networks. To model realistic social systems, some reasonable ingredients are introduced
into the evolutionary dynamics: each agent in the network is a pure strategist and can take
only one of three strategies (C, D, L); its strategy change depends on both the number of
each strategy state adopted and the magnitude of the average profits acquired by its
coplayers in the previous round of play. To model initiative and flexibility, a stochastic
strategy change is applied when agents get into the condition of local commons. The agents
self-adapt and self-organize into a dynamical equilibrium after a short transient. When the
agents are well structured (small values of Q), the three strategies can steadily coexist
and coevolve. On the other hand, for high temptation or more random networks, loners
dominate the network. In particular, in the case of very low temptation to defect, it is
found that agents are willing to participate in the game in the typical small-world region,
and intensive collective oscillations arise in the more random region.